
Shijian Lu

Nanyang Technological University

E.M.Ground: A Temporal Grounding Vid-LLM with Holistic Event Perception and Matching

Feb 05, 2026

Cross-Domain Few-Shot Segmentation via Multi-view Progressive Adaptation

Feb 05, 2026

Boosting SAM for Cross-Domain Few-Shot Segmentation via Conditional Point Sparsification

Feb 05, 2026

A Comprehensive Study on Visual Token Redundancy for Discrete Diffusion-based Multimodal Large Language Models

Nov 19, 2025

Spatial Preference Rewarding for MLLMs Spatial Understanding

Oct 16, 2025

UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation

Sep 19, 2025

H$_{2}$OT: Hierarchical Hourglass Tokenizer for Efficient Video Pose Transformers

Sep 08, 2025

PacGDC: Label-Efficient Generalizable Depth Completion with Projection Ambiguity and Consistency

Jul 10, 2025

UniDet-D: A Unified Dynamic Spectral Attention Model for Object Detection under Adverse Weathers

Jun 14, 2025

ToDRE: Visual Token Pruning via Diversity and Task Awareness for Efficient Large Vision-Language Models

May 24, 2025